Overview

Dataset statistics

Number of variables: 12
Number of observations: 150000
Missing cells: 33655
Missing cells (%): 1.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 13.7 MiB
Average record size in memory: 96.0 B

Variable types

Numeric: 11
Categorical: 1

Warnings

NumberOfTime30-59DaysPastDueNotWorse is highly correlated with NumberOfTimes90DaysLate and 1 other field [High correlation]
NumberOfTimes90DaysLate is highly correlated with NumberOfTime30-59DaysPastDueNotWorse and 1 other field [High correlation]
NumberOfTime60-89DaysPastDueNotWorse is highly correlated with NumberOfTime30-59DaysPastDueNotWorse and 1 other field [High correlation]
NumberOfTime30-59DaysPastDueNotWorse is highly correlated with NumberOfTime60-89DaysPastDueNotWorse and 1 other field [High correlation]
MonthlyIncome has 29731 (19.8%) missing values [Missing]
NumberOfDependents has 3924 (2.6%) missing values [Missing]
RevolvingUtilizationOfUnsecuredLines is highly skewed (γ1 = 97.63157449) [Skewed]
NumberOfTime30-59DaysPastDueNotWorse is highly skewed (γ1 = 22.59710756) [Skewed]
DebtRatio is highly skewed (γ1 = 95.15779287) [Skewed]
MonthlyIncome is highly skewed (γ1 = 114.0403179) [Skewed]
NumberOfTimes90DaysLate is highly skewed (γ1 = 23.08734547) [Skewed]
NumberOfTime60-89DaysPastDueNotWorse is highly skewed (γ1 = 23.33174312) [Skewed]
df_index is uniformly distributed [Uniform]
df_index has unique values [Unique]
RevolvingUtilizationOfUnsecuredLines has 10878 (7.3%) zeros [Zeros]
NumberOfTime30-59DaysPastDueNotWorse has 126018 (84.0%) zeros [Zeros]
DebtRatio has 4113 (2.7%) zeros [Zeros]
MonthlyIncome has 1634 (1.1%) zeros [Zeros]
NumberOfOpenCreditLinesAndLoans has 1888 (1.3%) zeros [Zeros]
NumberOfTimes90DaysLate has 141662 (94.4%) zeros [Zeros]
NumberRealEstateLoansOrLines has 56188 (37.5%) zeros [Zeros]
NumberOfTime60-89DaysPastDueNotWorse has 142396 (94.9%) zeros [Zeros]
NumberOfDependents has 86902 (57.9%) zeros [Zeros]

Reproduction

Analysis started: 2021-11-07 15:06:26.513172
Analysis finished: 2021-11-07 15:07:00.034434
Duration: 33.52 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 150000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 75000.5
Minimum: 1
Maximum: 150000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 1
5-th percentile: 7500.95
Q1: 37500.75
Median: 75000.5
Q3: 112500.25
95-th percentile: 142500.05
Maximum: 150000
Range: 149999
Interquartile range (IQR): 74999.5

Descriptive statistics

Standard deviation: 43301.41453
Coefficient of variation (CV): 0.5773483447
Kurtosis: -1.2
Mean: 75000.5
Median Absolute Deviation (MAD): 37500
Skewness: 0
Sum: 1.1250075 × 10^10
Variance: 1875012500
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2047 | 1 | < 0.1%
107806 | 1 | < 0.1%
9518 | 1 | < 0.1%
15661 | 1 | < 0.1%
13612 | 1 | < 0.1%
3371 | 1 | < 0.1%
1322 | 1 | < 0.1%
7465 | 1 | < 0.1%
5416 | 1 | < 0.1%
27943 | 1 | < 0.1%
Other values (149990) | 149990 | > 99.9%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Value | Count | Frequency (%)
150000 | 1 | < 0.1%
149999 | 1 | < 0.1%
149998 | 1 | < 0.1%
149997 | 1 | < 0.1%
149996 | 1 | < 0.1%
149995 | 1 | < 0.1%
149994 | 1 | < 0.1%
149993 | 1 | < 0.1%
149992 | 1 | < 0.1%
149991 | 1 | < 0.1%

SeriousDlqin2yrs
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 MiB

Value | Count
0 | 139974
1 | 10026

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 150000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 139974 | 93.3%
1 | 10026 | 6.7%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
0 | 139974 | 93.3%
1 | 10026 | 6.7%

Most occurring characters

Value | Count | Frequency (%)
0 | 139974 | 93.3%
1 | 10026 | 6.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 150000 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 139974 | 93.3%
1 | 10026 | 6.7%

Most occurring scripts

Value | Count | Frequency (%)
Common | 150000 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 139974 | 93.3%
1 | 10026 | 6.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 150000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 139974 | 93.3%
1 | 10026 | 6.7%

RevolvingUtilizationOfUnsecuredLines
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 125728
Distinct (%): 83.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.048438055
Minimum: 0
Maximum: 50708
Zeros: 10878
Zeros (%): 7.3%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.029867442
Median: 0.154180737
Q3: 0.5590462475
95-th percentile: 0.9999999
Maximum: 50708
Range: 50708
Interquartile range (IQR): 0.5291788055

Descriptive statistics

Standard deviation: 249.7553706
Coefficient of variation (CV): 41.29254005
Kurtosis: 14544.71341
Mean: 6.048438055
Median Absolute Deviation (MAD): 0.148325347
Skewness: 97.63157449
Sum: 907265.7082
Variance: 62377.74516
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 10878 | 7.3%
0.9999999 | 10256 | 6.8%
1 | 17 | < 0.1%
0.9500998 | 8 | < 0.1%
0.71314741 | 6 | < 0.1%
0.007984032 | 6 | < 0.1%
0.954091816 | 6 | < 0.1%
0.796407186 | 5 | < 0.1%
0.850299401 | 5 | < 0.1%
0.538922156 | 5 | < 0.1%
Other values (125718) | 128808 | 85.9%

Value | Count | Frequency (%)
0 | 10878 | 7.3%
8.37 × 10^-6 | 1 | < 0.1%
9.93 × 10^-6 | 1 | < 0.1%
1.25 × 10^-5 | 1 | < 0.1%
1.43 × 10^-5 | 1 | < 0.1%
1.49 × 10^-5 | 1 | < 0.1%
1.51 × 10^-5 | 1 | < 0.1%
1.6 × 10^-5 | 1 | < 0.1%
1.64 × 10^-5 | 1 | < 0.1%
1.87 × 10^-5 | 1 | < 0.1%

Value | Count | Frequency (%)
50708 | 1 | < 0.1%
29110 | 1 | < 0.1%
22198 | 1 | < 0.1%
22000 | 1 | < 0.1%
20514 | 1 | < 0.1%
18300 | 1 | < 0.1%
17441 | 1 | < 0.1%
13930 | 1 | < 0.1%
13498 | 1 | < 0.1%
13400 | 1 | < 0.1%

age
Real number (ℝ≥0)

Distinct: 86
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.29520667
Minimum: 0
Maximum: 109
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 29
Q1: 41
Median: 52
Q3: 63
95-th percentile: 78
Maximum: 109
Range: 109
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 14.77186586
Coefficient of variation (CV): 0.2824707426
Kurtosis: -0.4946688326
Mean: 52.29520667
Median Absolute Deviation (MAD): 11
Skewness: 0.1889945451
Sum: 7844281
Variance: 218.2080211
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
49 | 3837 | 2.6%
48 | 3806 | 2.5%
50 | 3753 | 2.5%
63 | 3719 | 2.5%
47 | 3719 | 2.5%
46 | 3714 | 2.5%
53 | 3648 | 2.4%
51 | 3627 | 2.4%
52 | 3609 | 2.4%
56 | 3589 | 2.4%
Other values (76) | 112979 | 75.3%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
21 | 183 | 0.1%
22 | 434 | 0.3%
23 | 641 | 0.4%
24 | 816 | 0.5%
25 | 953 | 0.6%
26 | 1193 | 0.8%
27 | 1338 | 0.9%
28 | 1560 | 1.0%
29 | 1702 | 1.1%

Value | Count | Frequency (%)
109 | 2 | < 0.1%
107 | 1 | < 0.1%
105 | 1 | < 0.1%
103 | 3 | < 0.1%
102 | 3 | < 0.1%
101 | 3 | < 0.1%
99 | 9 | < 0.1%
98 | 6 | < 0.1%
97 | 17 | < 0.1%
96 | 18 | < 0.1%

NumberOfTime30-59DaysPastDueNotWorse
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4210333333
Minimum: 0
Maximum: 98
Zeros: 126018
Zeros (%): 84.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 2
Maximum: 98
Range: 98
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 4.192781272
Coefficient of variation (CV): 9.958311944
Kurtosis: 522.3765449
Mean: 0.4210333333
Median Absolute Deviation (MAD): 0
Skewness: 22.59710756
Sum: 63155
Variance: 17.57941479
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=16)
Value | Count | Frequency (%)
0 | 126018 | 84.0%
1 | 16033 | 10.7%
2 | 4598 | 3.1%
3 | 1754 | 1.2%
4 | 747 | 0.5%
5 | 342 | 0.2%
98 | 264 | 0.2%
6 | 140 | 0.1%
7 | 54 | < 0.1%
8 | 25 | < 0.1%
Other values (6) | 25 | < 0.1%

Value | Count | Frequency (%)
0 | 126018 | 84.0%
1 | 16033 | 10.7%
2 | 4598 | 3.1%
3 | 1754 | 1.2%
4 | 747 | 0.5%
5 | 342 | 0.2%
6 | 140 | 0.1%
7 | 54 | < 0.1%
8 | 25 | < 0.1%
9 | 12 | < 0.1%

Value | Count | Frequency (%)
98 | 264 | 0.2%
96 | 5 | < 0.1%
13 | 1 | < 0.1%
12 | 2 | < 0.1%
11 | 1 | < 0.1%
10 | 4 | < 0.1%
9 | 12 | < 0.1%
8 | 25 | < 0.1%
7 | 54 | < 0.1%
6 | 140 | 0.1%

DebtRatio
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 114194
Distinct (%): 76.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 353.0050758
Minimum: 0
Maximum: 329664
Zeros: 4113
Zeros (%): 2.7%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0.004329004
Q1: 0.1750738323
Median: 0.366507841
Q3: 0.8682537732
95-th percentile: 2449
Maximum: 329664
Range: 329664
Interquartile range (IQR): 0.693179941

Descriptive statistics

Standard deviation: 2037.818523
Coefficient of variation (CV): 5.772774
Kurtosis: 13734.28886
Mean: 353.0050758
Median Absolute Deviation (MAD): 0.2457227975
Skewness: 95.15779287
Sum: 52950761.36
Variance: 4152704.333
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 4113 | 2.7%
1 | 229 | 0.2%
4 | 174 | 0.1%
2 | 170 | 0.1%
3 | 162 | 0.1%
5 | 143 | 0.1%
9 | 125 | 0.1%
10 | 117 | 0.1%
7 | 115 | 0.1%
13 | 114 | 0.1%
Other values (114184) | 144538 | 96.4%

Value | Count | Frequency (%)
0 | 4113 | 2.7%
2.6 × 10^-5 | 1 | < 0.1%
3.69 × 10^-5 | 1 | < 0.1%
3.93 × 10^-5 | 1 | < 0.1%
6.62 × 10^-5 | 1 | < 0.1%
7.5 × 10^-5 | 1 | < 0.1%
8 × 10^-5 | 1 | < 0.1%
8.57 × 10^-5 | 1 | < 0.1%
9.09 × 10^-5 | 1 | < 0.1%
9.15 × 10^-5 | 1 | < 0.1%

Value | Count | Frequency (%)
329664 | 1 | < 0.1%
326442 | 1 | < 0.1%
307001 | 1 | < 0.1%
220516 | 1 | < 0.1%
168835 | 1 | < 0.1%
110952 | 1 | < 0.1%
106885 | 1 | < 0.1%
101320 | 1 | < 0.1%
61907 | 1 | < 0.1%
61106.5 | 1 | < 0.1%

MonthlyIncome
Real number (ℝ≥0)

MISSING
SKEWED
ZEROS

Distinct: 13594
Distinct (%): 11.3%
Missing: 29731
Missing (%): 19.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 6670.221237
Minimum: 0
Maximum: 3008750
Zeros: 1634
Zeros (%): 1.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1300
Q1: 3400
Median: 5400
Q3: 8249
95-th percentile: 14587.6
Maximum: 3008750
Range: 3008750
Interquartile range (IQR): 4849

Descriptive statistics

Standard deviation: 14384.67422
Coefficient of variation (CV): 2.15655129
Kurtosis: 19504.7054
Mean: 6670.221237
Median Absolute Deviation (MAD): 2317
Skewness: 114.0403179
Sum: 802220838
Variance: 206918852.3
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
5000 | 2757 | 1.8%
4000 | 2106 | 1.4%
6000 | 1934 | 1.3%
3000 | 1758 | 1.2%
0 | 1634 | 1.1%
2500 | 1551 | 1.0%
10000 | 1466 | 1.0%
3500 | 1360 | 0.9%
4500 | 1226 | 0.8%
7000 | 1223 | 0.8%
Other values (13584) | 103254 | 68.8%
(Missing) | 29731 | 19.8%

Value | Count | Frequency (%)
0 | 1634 | 1.1%
1 | 605 | 0.4%
2 | 6 | < 0.1%
4 | 2 | < 0.1%
5 | 2 | < 0.1%
7 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 2 | < 0.1%
11 | 1 | < 0.1%
15 | 1 | < 0.1%

Value | Count | Frequency (%)
3008750 | 1 | < 0.1%
1794060 | 1 | < 0.1%
1560100 | 1 | < 0.1%
1072500 | 1 | < 0.1%
835040 | 1 | < 0.1%
730483 | 1 | < 0.1%
702500 | 1 | < 0.1%
699530 | 1 | < 0.1%
649587 | 1 | < 0.1%
629000 | 1 | < 0.1%

NumberOfOpenCreditLinesAndLoans
Real number (ℝ≥0)

ZEROS

Distinct: 58
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.45276
Minimum: 0
Maximum: 58
Zeros: 1888
Zeros (%): 1.3%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 5
Median: 8
Q3: 11
95-th percentile: 18
Maximum: 58
Range: 58
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 5.14595099
Coefficient of variation (CV): 0.6087894356
Kurtosis: 3.091066746
Mean: 8.45276
Median Absolute Deviation (MAD): 3
Skewness: 1.21531378
Sum: 1267914
Variance: 26.48081159
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
6 | 13614 | 9.1%
7 | 13245 | 8.8%
5 | 12931 | 8.6%
8 | 12562 | 8.4%
4 | 11609 | 7.7%
9 | 11355 | 7.6%
10 | 9624 | 6.4%
3 | 9058 | 6.0%
11 | 8321 | 5.5%
12 | 7005 | 4.7%
Other values (48) | 40676 | 27.1%

Value | Count | Frequency (%)
0 | 1888 | 1.3%
1 | 4438 | 3.0%
2 | 6666 | 4.4%
3 | 9058 | 6.0%
4 | 11609 | 7.7%
5 | 12931 | 8.6%
6 | 13614 | 9.1%
7 | 13245 | 8.8%
8 | 12562 | 8.4%
9 | 11355 | 7.6%

Value | Count | Frequency (%)
58 | 1 | < 0.1%
57 | 2 | < 0.1%
56 | 2 | < 0.1%
54 | 4 | < 0.1%
53 | 1 | < 0.1%
52 | 3 | < 0.1%
51 | 2 | < 0.1%
50 | 2 | < 0.1%
49 | 4 | < 0.1%
48 | 6 | < 0.1%

NumberOfTimes90DaysLate
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 19
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2659733333
Minimum: 0
Maximum: 98
Zeros: 141662
Zeros (%): 94.4%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 98
Range: 98
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 4.169303788
Coefficient of variation (CV): 15.67564588
Kurtosis: 537.7389446
Mean: 0.2659733333
Median Absolute Deviation (MAD): 0
Skewness: 23.08734547
Sum: 39896
Variance: 17.38309407
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)
Value | Count | Frequency (%)
0 | 141662 | 94.4%
1 | 5243 | 3.5%
2 | 1555 | 1.0%
3 | 667 | 0.4%
4 | 291 | 0.2%
98 | 264 | 0.2%
5 | 131 | 0.1%
6 | 80 | 0.1%
7 | 38 | < 0.1%
8 | 21 | < 0.1%
Other values (9) | 48 | < 0.1%

Value | Count | Frequency (%)
0 | 141662 | 94.4%
1 | 5243 | 3.5%
2 | 1555 | 1.0%
3 | 667 | 0.4%
4 | 291 | 0.2%
5 | 131 | 0.1%
6 | 80 | 0.1%
7 | 38 | < 0.1%
8 | 21 | < 0.1%
9 | 19 | < 0.1%

Value | Count | Frequency (%)
98 | 264 | 0.2%
96 | 5 | < 0.1%
17 | 1 | < 0.1%
15 | 2 | < 0.1%
14 | 2 | < 0.1%
13 | 4 | < 0.1%
12 | 2 | < 0.1%
11 | 5 | < 0.1%
10 | 8 | < 0.1%
9 | 19 | < 0.1%

NumberRealEstateLoansOrLines
Real number (ℝ≥0)

ZEROS

Distinct: 28
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.01824
Minimum: 0
Maximum: 54
Zeros: 56188
Zeros (%): 37.5%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 2
95-th percentile: 3
Maximum: 54
Range: 54
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.129770985
Coefficient of variation (CV): 1.109533101
Kurtosis: 60.47680765
Mean: 1.01824
Median Absolute Deviation (MAD): 1
Skewness: 3.482483994
Sum: 152736
Variance: 1.276382478
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=28)
Value | Count | Frequency (%)
0 | 56188 | 37.5%
1 | 52338 | 34.9%
2 | 31522 | 21.0%
3 | 6300 | 4.2%
4 | 2170 | 1.4%
5 | 689 | 0.5%
6 | 320 | 0.2%
7 | 171 | 0.1%
8 | 93 | 0.1%
9 | 78 | 0.1%
Other values (18) | 131 | 0.1%

Value | Count | Frequency (%)
0 | 56188 | 37.5%
1 | 52338 | 34.9%
2 | 31522 | 21.0%
3 | 6300 | 4.2%
4 | 2170 | 1.4%
5 | 689 | 0.5%
6 | 320 | 0.2%
7 | 171 | 0.1%
8 | 93 | 0.1%
9 | 78 | 0.1%

Value | Count | Frequency (%)
54 | 1 | < 0.1%
32 | 1 | < 0.1%
29 | 1 | < 0.1%
26 | 1 | < 0.1%
25 | 3 | < 0.1%
23 | 2 | < 0.1%
21 | 1 | < 0.1%
20 | 2 | < 0.1%
19 | 2 | < 0.1%
18 | 2 | < 0.1%

NumberOfTime60-89DaysPastDueNotWorse
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2403866667
Minimum: 0
Maximum: 98
Zeros: 142396
Zeros (%): 94.9%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 98
Range: 98
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 4.155179421
Coefficient of variation (CV): 17.28539889
Kurtosis: 545.6827435
Mean: 0.2403866667
Median Absolute Deviation (MAD): 0
Skewness: 23.33174312
Sum: 36058
Variance: 17.26551602
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Value | Count | Frequency (%)
0 | 142396 | 94.9%
1 | 5731 | 3.8%
2 | 1118 | 0.7%
3 | 318 | 0.2%
98 | 264 | 0.2%
4 | 105 | 0.1%
5 | 34 | < 0.1%
6 | 16 | < 0.1%
7 | 9 | < 0.1%
96 | 5 | < 0.1%
Other values (3) | 4 | < 0.1%

Value | Count | Frequency (%)
0 | 142396 | 94.9%
1 | 5731 | 3.8%
2 | 1118 | 0.7%
3 | 318 | 0.2%
4 | 105 | 0.1%
5 | 34 | < 0.1%
6 | 16 | < 0.1%
7 | 9 | < 0.1%
8 | 2 | < 0.1%
9 | 1 | < 0.1%

Value | Count | Frequency (%)
98 | 264 | 0.2%
96 | 5 | < 0.1%
11 | 1 | < 0.1%
9 | 1 | < 0.1%
8 | 2 | < 0.1%
7 | 9 | < 0.1%
6 | 16 | < 0.1%
5 | 34 | < 0.1%
4 | 105 | 0.1%
3 | 318 | 0.2%

NumberOfDependents
Real number (ℝ≥0)

MISSING
ZEROS

Distinct: 13
Distinct (%): < 0.1%
Missing: 3924
Missing (%): 2.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.7572222679
Minimum: 0
Maximum: 20
Zeros: 86902
Zeros (%): 57.9%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 3
Maximum: 20
Range: 20
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.115086071
Coefficient of variation (CV): 1.472600739
Kurtosis: 3.001656811
Mean: 0.7572222679
Median Absolute Deviation (MAD): 0
Skewness: 1.588242379
Sum: 110612
Variance: 1.243416947
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Value | Count | Frequency (%)
0 | 86902 | 57.9%
1 | 26316 | 17.5%
2 | 19522 | 13.0%
3 | 9483 | 6.3%
4 | 2862 | 1.9%
5 | 746 | 0.5%
6 | 158 | 0.1%
7 | 51 | < 0.1%
8 | 24 | < 0.1%
9 | 5 | < 0.1%
Other values (3) | 7 | < 0.1%
(Missing) | 3924 | 2.6%

Value | Count | Frequency (%)
0 | 86902 | 57.9%
1 | 26316 | 17.5%
2 | 19522 | 13.0%
3 | 9483 | 6.3%
4 | 2862 | 1.9%
5 | 746 | 0.5%
6 | 158 | 0.1%
7 | 51 | < 0.1%
8 | 24 | < 0.1%
9 | 5 | < 0.1%

Value | Count | Frequency (%)
20 | 1 | < 0.1%
13 | 1 | < 0.1%
10 | 5 | < 0.1%
9 | 5 | < 0.1%
8 | 24 | < 0.1%
7 | 51 | < 0.1%
6 | 158 | 0.1%
5 | 746 | 0.5%
4 | 2862 | 1.9%
3 | 9483 | 6.3%

Interactions

[Pairwise interaction plots (121 panels); images not recoverable.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
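As a minimal sketch of the calculation described above (not the report generator's own code), r can be computed with the standard library alone. The function name `pearson_r` and the toy data are assumptions made for illustration; none of the values come from this dataset.

```python
import math


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of products and squares are sufficient.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical toy data, not taken from the report.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

r_original = pearson_r(x, y)
# Shifting and rescaling either variable by a positive factor leaves r unchanged.
r_rescaled = pearson_r([10 * v + 7 for v in x], [0.5 * v - 3 for v in y])
print(r_original, r_rescaled)
```

The second call illustrates the invariance under changes of location and scale mentioned above; a negative scale factor would flip the sign of r.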

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
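The rank-based calculation above can be sketched as follows, assuming ties receive the average of the positions they span (a common convention, though the report does not state which one it uses). The helper names `average_ranks` and `spearman_rho` are hypothetical, as is the toy data.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation applied to the rank variables of X and Y."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    ss_x = sum((a - mx) ** 2 for a in rx)
    ss_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical toy data: y is a nonlinear but strictly increasing function of x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
print(rho)
```

Because only the ordering matters, the cubic relationship here is treated exactly like a linear one, which is the sense in which ρ captures nonlinear monotonic association.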

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
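The pair-counting rule above corresponds to the simplest form of the coefficient (often called tau-a). The sketch below implements exactly that rule; library implementations may apply a tie correction instead, so their results can differ on heavily tied columns such as the past-due counts in this dataset. The function name `kendall_tau_a` and the data are illustrative assumptions.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    concordant = 0
    discordant = 0
    total = 0
    for i, j in combinations(range(len(xs)), 2):
        total += 1
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts towards neither group
    return (concordant - discordant) / total


# Hypothetical toy data: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

This loop is quadratic in the number of observations, so it is meant to show the definition rather than to be run on 150000 rows.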

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
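One plausible reading of the nullity correlation in the heatmap caption is a Pearson correlation between the present/absent indicators of two columns. The sketch below works under that assumption and does not reproduce the plotting library's code; the function name `nullity_correlation` and the column values are hypothetical, with `None` standing in for a missing cell.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'value is present' indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    ss_a = sum((p - mean_a) ** 2 for p in a)
    ss_b = sum((q - mean_b) ** 2 for q in b)
    if ss_a == 0 or ss_b == 0:
        # A column that is fully present (or fully missing) has no variation.
        return float("nan")
    return cov / math.sqrt(ss_a * ss_b)


# Hypothetical columns; None marks a missing cell.
income = [9120.0, None, 3042.0, None, 63588.0, 3500.0]
dependents = [2.0, None, 0.0, 1.0, 0.0, None]

score = nullity_correlation(income, dependents)
print(score)
```

A value near +1 would mean the two columns tend to be missing on the same rows, a value near -1 that one is present where the other is missing, and a value near 0 that their missingness is unrelated.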

Sample

First rows

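index | df_index | SeriousDlqin2yrs | RevolvingUtilizationOfUnsecuredLines | age | NumberOfTime30-59DaysPastDueNotWorse | DebtRatio | MonthlyIncome | NumberOfOpenCreditLinesAndLoans | NumberOfTimes90DaysLate | NumberRealEstateLoansOrLines | NumberOfTime60-89DaysPastDueNotWorse | NumberOfDependents
0 | 1 | 1 | 0.766127 | 45 | 2 | 0.802982 | 9120.0 | 13 | 0 | 6 | 0 | 2.0
1 | 2 | 0 | 0.957151 | 40 | 0 | 0.121876 | 2600.0 | 4 | 0 | 0 | 0 | 1.0
2 | 3 | 0 | 0.658180 | 38 | 1 | 0.085113 | 3042.0 | 2 | 1 | 0 | 0 | 0.0
3 | 4 | 0 | 0.233810 | 30 | 0 | 0.036050 | 3300.0 | 5 | 0 | 0 | 0 | 0.0
4 | 5 | 0 | 0.907239 | 49 | 1 | 0.024926 | 63588.0 | 7 | 0 | 1 | 0 | 0.0
5 | 6 | 0 | 0.213179 | 74 | 0 | 0.375607 | 3500.0 | 3 | 0 | 1 | 0 | 1.0
6 | 7 | 0 | 0.305682 | 57 | 0 | 5710.000000 | NaN | 8 | 0 | 3 | 0 | 0.0
7 | 8 | 0 | 0.754464 | 39 | 0 | 0.209940 | 3500.0 | 8 | 0 | 0 | 0 | 0.0
8 | 9 | 0 | 0.116951 | 27 | 0 | 46.000000 | NaN | 2 | 0 | 0 | 0 | NaN
9 | 10 | 0 | 0.189169 | 57 | 0 | 0.606291 | 23684.0 | 9 | 0 | 4 | 0 | 2.0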

Last rows

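index | df_index | SeriousDlqin2yrs | RevolvingUtilizationOfUnsecuredLines | age | NumberOfTime30-59DaysPastDueNotWorse | DebtRatio | MonthlyIncome | NumberOfOpenCreditLinesAndLoans | NumberOfTimes90DaysLate | NumberRealEstateLoansOrLines | NumberOfTime60-89DaysPastDueNotWorse | NumberOfDependents
149990 | 149991 | 0 | 0.055518 | 46 | 0 | 0.609779 | 4335.0 | 7 | 0 | 1 | 0 | 2.0
149991 | 149992 | 0 | 0.104112 | 59 | 0 | 0.477658 | 10316.0 | 10 | 0 | 2 | 0 | 0.0
149992 | 149993 | 0 | 0.871976 | 50 | 0 | 4132.000000 | NaN | 11 | 0 | 1 | 0 | 3.0
149993 | 149994 | 0 | 1.000000 | 22 | 0 | 0.000000 | 820.0 | 1 | 0 | 0 | 0 | 0.0
149994 | 149995 | 0 | 0.385742 | 50 | 0 | 0.404293 | 3400.0 | 7 | 0 | 0 | 0 | 0.0
149995 | 149996 | 0 | 0.040674 | 74 | 0 | 0.225131 | 2100.0 | 4 | 0 | 1 | 0 | 0.0
149996 | 149997 | 0 | 0.299745 | 44 | 0 | 0.716562 | 5584.0 | 4 | 0 | 1 | 0 | 2.0
149997 | 149998 | 0 | 0.246044 | 58 | 0 | 3870.000000 | NaN | 18 | 0 | 1 | 0 | 0.0
149998 | 149999 | 0 | 0.000000 | 30 | 0 | 0.000000 | 5716.0 | 4 | 0 | 0 | 0 | 0.0
149999 | 150000 | 0 | 0.850283 | 64 | 0 | 0.249908 | 8158.0 | 8 | 0 | 2 | 0 | 0.0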